 Bingoel Province





Learning filter widths of spectral decompositions with wavelets

Haidar Khan, Bulent Yener

Neural Information Processing Systems

Deep neural networks for time series classification, such as convolutional neural networks (CNNs), operate on the spectral decomposition of the time series, computed in a preprocessing step.


Detection of Suicidal Risk on Social Media: A Hybrid Model

Yang, Zaihan, Leonard, Ryan, Tran, Hien, Driscoll, Rory, Davis, Chadbourne

arXiv.org Artificial Intelligence

Suicidal thoughts and behaviors are increasingly recognized as a critical societal concern, highlighting the urgent need for effective tools to enable early detection of suicidal risk. In this work, we develop robust machine learning models that leverage Reddit posts to automatically classify them into four distinct levels of suicide risk severity. We frame this as a multi-class classification task and propose a RoBERTa-TF-IDF-PCA Hybrid model, integrating the deep contextual embeddings from Robustly Optimized BERT Approach (RoBERTa), a state-of-the-art deep learning transformer model, with the statistical term-weighting of TF-IDF, further compressed with PCA, to boost the accuracy and reliability of suicide risk assessment. To address data imbalance and overfitting, we explore various data resampling techniques and data augmentation strategies to enhance model generalization. Additionally, we compare our model's performance against that of using RoBERTa only, the BERT model and other traditional machine learning classifiers. Suicidal thoughts and behaviors are increasingly becoming a significant societal concern. As of the latest estimates from the World Health Organization, approximately 700,000 to 800,000 people die by suicide globally each year. In the U.S., suicide is the second leading cause of death for individuals aged 10-34 and the fourth leading cause for those aged 35-64. Suicidal thoughts can vary in severity, ranging from explicit and repetitive suicidal feelings, to actively planning suicide or engaging in self-harm behaviors like cutting or burning, and ultimately to making actual attempts through methods such as cutting, jumping, drug overdose, or using firearms.
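The statistical term-weighting half of such a hybrid can be illustrated with a minimal, standard-library sketch of TF-IDF. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF variant, the helper names `tfidf_vectors` and `hybrid_features`, and the three-number stand-in for a RoBERTa embedding are all assumptions made for illustration. The PCA compression step is omitted, and simple concatenation is only one plausible way to combine the two feature sets.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with one TF-IDF vector per document."""
    # Assumption: lowercase whitespace tokenization, for brevity only.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed inverse document frequency (one common variant).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        # Term frequency (relative count) times IDF, in vocabulary order.
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def hybrid_features(embedding, tfidf_vec):
    """Hypothetical combination step: concatenate the two feature vectors."""
    return list(embedding) + list(tfidf_vec)


posts = ["i feel fine today", "i feel hopeless today", "fine weather today"]
vocab, vecs = tfidf_vectors(posts)
fake_embedding = [0.1, -0.2, 0.3]  # stand-in for a contextual sentence embedding
features = hybrid_features(fake_embedding, vecs[1])
```

In the second post, "hopeless" occurs in only one document and therefore receives a higher weight than "today", which occurs in all three. That is the property that makes TF-IDF a useful complement to contextual embeddings.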


A multi-head deep fusion model for recognition of cattle foraging events using sound and movement signals

Ferrero, Mariano, Chelotti, José Omar, Martinez-Rau, Luciano Sebastián, Vignolo, Leandro, Pires, Martín, Galli, Julio Ricardo, Giovanini, Leonardo Luis, Rufiner, Hugo Leonardo

arXiv.org Artificial Intelligence

Monitoring feeding behaviour is a relevant task for efficient herd management and the effective use of available resources in grazing cattle. The ability to automatically recognise animals' feeding activities through the identification of specific jaw movements allows for the improvement of diet formulation, as well as early detection of metabolic problems and symptoms of animal discomfort, among other benefits. The use of sensors to obtain signals for such monitoring has become popular in the last two decades. The most frequently employed sensors include accelerometers, microphones, and cameras, each with its own set of advantages and drawbacks. An unexplored aspect is the simultaneous use of multiple sensors with the aim of combining signals in order to enhance the precision of the estimations. In this direction, this work introduces a deep neural network based on the fusion of acoustic and inertial signals, composed of convolutional, recurrent, and dense layers. The main advantage of this model is the combination of signals through the automatic extraction of features independently from each of them. The model has emerged from an exploration and comparison of different neural network architectures proposed in this work, which carry out information fusion at different levels. Feature-level fusion has outperformed data-level and decision-level fusion by at least 0.14 in F1-score. Moreover, a comparison with state-of-the-art machine learning methods is presented, including traditional and deep learning approaches. The proposed model yielded an F1-score of 0.802, representing a 14% increase compared to previous methods. Finally, results from an ablation study and post-training quantization evaluation are also reported.
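The three fusion levels compared in this abstract differ in where the two modalities are joined. The toy sketch below shows that difference using plain functions instead of neural network layers. Everything in it is an assumption for illustration: the hand-written `extract_features` statistics, the `toy_score` scorer, and the example signals stand in for the learned convolutional, recurrent, and dense components of the actual model.

```python
import math
from statistics import mean, pstdev


def extract_features(signal):
    """Toy feature extractor: mean, spread, and peak amplitude of a signal."""
    return [mean(signal), pstdev(signal), max(abs(x) for x in signal)]


def toy_score(features):
    """Hypothetical scorer that squashes the feature sum into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-sum(features)))


def data_level_features(audio, imu):
    # Data-level fusion: join the raw signals, then extract features once.
    return extract_features(list(audio) + list(imu))


def feature_level_features(audio, imu):
    # Feature-level fusion: extract features per modality, then concatenate.
    return extract_features(audio) + extract_features(imu)


def decision_level_score(audio, imu):
    # Decision-level fusion: score each modality alone, then average the scores.
    return mean([toy_score(extract_features(audio)),
                 toy_score(extract_features(imu))])


audio = [0.2, -0.4, 0.9, -0.1]  # made-up acoustic samples
imu = [0.05, 0.07, -0.02]       # made-up inertial samples

scores = {
    "data": toy_score(data_level_features(audio, imu)),
    "feature": toy_score(feature_level_features(audio, imu)),
    "decision": decision_level_score(audio, imu),
}
```

Feature-level fusion keeps a separate feature vector for each sensor until the join, which matches the abstract's description of extracting features independently from each signal before combining them.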


Design and Analysis of an Extreme-Scale, High-Performance, and Modular Agent-Based Simulation Platform

Breitwieser, Lukas Johannes

arXiv.org Artificial Intelligence

Agent-based modeling is indispensable for studying complex systems across many domains. However, existing simulation platforms exhibit two major issues: performance and modularity. Low performance prevents simulations with a large number of agents, increases development time, limits parameter exploration, and raises computing costs. Inflexible software designs motivate modelers to create their own tools, diverting valuable resources. This dissertation introduces a novel simulation platform called BioDynaMo and its significant improvement, TeraAgent, to alleviate these challenges via three major works. First, we lay the platform's foundation by defining abstractions, establishing software infrastructure, and implementing a multitude of features for agent-based modeling. We demonstrate BioDynaMo's modularity through use cases in neuroscience, epidemiology, and oncology. We validate these models and show the simplicity of adding new functionality with few lines of code. Second, we perform a rigorous performance analysis and identify challenges for shared-memory parallelism. Provided solutions include an optimized grid for neighbor searching, mechanisms to reduce the memory access latency, and exploiting domain knowledge to omit unnecessary work. These improvements yield up to three orders of magnitude speedups, enabling simulations of 1.7 billion agents on a single server. Third, we present TeraAgent, a distributed simulation engine that allows scaling out the computation of one simulation to multiple servers. We identify and address server communication bottlenecks and implement solutions for serialization and delta encoding to accelerate and reduce data transfer. TeraAgent can simulate 500 billion agents and scales to 84096 CPU cores. BioDynaMo has been widely adopted, including a prize-winning radiotherapy simulation recognized as a top 10 breakthrough in physics in 2024.
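Delta encoding reduces data transfer by sending only what changed since the last exchange instead of the full state. The abstract does not describe TeraAgent's actual scheme, so the following is a generic sketch under stated assumptions: agent state is a dictionary from agent ID to position, and `encode_delta` and `apply_delta` are hypothetical helper names.

```python
def encode_delta(previous, current):
    """Return only the differences between two agent-state snapshots."""
    # Entries that are new or whose value differs from the previous snapshot.
    changed = {
        agent_id: state
        for agent_id, state in current.items()
        if agent_id not in previous or previous[agent_id] != state
    }
    # Agents present before but absent now.
    removed = [agent_id for agent_id in previous if agent_id not in current]
    return {"changed": changed, "removed": removed}


def apply_delta(previous, delta):
    """Rebuild the current snapshot from the previous one plus a delta."""
    gone = set(delta["removed"])
    state = {k: v for k, v in previous.items() if k not in gone}
    state.update(delta["changed"])
    return state


# Illustrative snapshots: agent ID -> (x, y) position.
prev_snapshot = {1: (0.0, 0.0), 2: (1.0, 1.0), 3: (2.0, 2.0)}
curr_snapshot = {1: (0.0, 0.0), 2: (1.5, 1.0), 4: (3.0, 3.0)}

delta = encode_delta(prev_snapshot, curr_snapshot)
restored = apply_delta(prev_snapshot, delta)
```

Agent 1 did not move, so it does not appear in the delta at all. When most agents change little between iterations, the message stays small compared with a full snapshot.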


From Features to Transformers: Redefining Ranking for Scalable Impact

Borisyuk, Fedor, Hertel, Lars, Parameswaran, Ganesh, Srivastava, Gaurav, Ramanujam, Sudarshan Srinivasa, Ocejo, Borja, Du, Peng, Akterskii, Andrei, Daftary, Neil, Tang, Shao, Sun, Daqi, Xiao, Qiang Charles, Nathani, Deepesh, Kothari, Mohit, Dai, Yun, Gupta, Aman

arXiv.org Artificial Intelligence

We present LiGR, a large-scale ranking framework developed at LinkedIn that brings state-of-the-art transformer-based modeling architectures into production. We introduce a modified transformer architecture that incorporates learned normalization and simultaneous set-wise attention to user history and ranked items. This architecture enables several breakthrough achievements, including: (1) the deprecation of most manually designed feature engineering, outperforming the prior state-of-the-art system using only a few features (compared to hundreds in the baseline), (2) validation of the scaling law for ranking systems, showing improved performance with larger models, more training data, and longer context sequences, and (3) simultaneous joint scoring of items in a set-wise manner, leading to automated improvements in diversity. To enable efficient serving of large ranking models, we describe techniques to scale inference effectively using single-pass processing of user history and set-wise attention. We also summarize key insights from various ablation studies and A/B tests, highlighting the most impactful technical approaches.


Multimodal Machine Learning Can Predict Videoconference Fluidity and Enjoyment

Chang, Andrew, Akkaraju, Viswadruth, Cogliano, Ray McFadden, Poeppel, David, Freeman, Dustin

arXiv.org Artificial Intelligence

Videoconferencing is now a frequent mode of communication in both professional and informal settings, yet it often lacks the fluidity and enjoyment of in-person conversation. This study leverages multimodal machine learning to predict moments of negative experience in videoconferencing. We sampled thousands of short clips from the RoomReader corpus, extracting audio embeddings, facial actions, and body motion features to train models for identifying low conversational fluidity, low enjoyment, and classifying conversational events (backchanneling, interruption, or gap). Our best models achieved an ROC-AUC of up to 0.87 on hold-out videoconference sessions, with domain-general audio features proving most critical. This work demonstrates that multimodal audio-video signals can effectively predict high-level subjective conversational outcomes. In addition, this is a contribution to research on videoconferencing user experience by showing that multimodal machine learning can be used to identify rare moments of negative user experience for further study or mitigation.


The First Multilingual Model For The Detection of Suicide Texts

Zevallos, Rodolfo, Schoene, Annika, Ortega, John E.

arXiv.org Artificial Intelligence

Suicidal ideation is a serious health problem affecting millions of people worldwide. Social networks provide information about these mental health problems through users' emotional expressions. We propose a multilingual model leveraging transformer architectures like mBERT, XLM-R, and mT5 to detect suicidal text across posts in six languages: Spanish, English, German, Catalan, Portuguese and Italian. A Spanish suicide ideation tweet dataset was translated into five other languages using SeamlessM4T. Each model was fine-tuned on this multilingual data and evaluated across classification metrics. Results showed mT5 achieving the best performance overall with F1 scores above 85%, highlighting capabilities for cross-lingual transfer learning. The English and Spanish translations also displayed high quality based on perplexity. Our exploration underscores the importance of considering linguistic diversity in developing automated multilingual tools to identify suicidal risk. Limitations exist around semantic fidelity in translations and ethical implications which provide guidance for future human-in-the-loop evaluations.